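Prior work has shown that data augmentation is useful for improving dialogue state tracking. However, user utterances come in many types, and the prior method considered only the simplest one for augmentation, which raises concerns about poor generalization. To better cover diverse dialogue acts and control generation quality, this paper proposes controllable user dialogue act augmentation (CUDA-DST), which augments user utterances with diverse behaviors. With the augmented data, different state trackers improve and show better robustness, achieving state-of-the-art performance on MultiWOZ 2.1.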
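Automatically classifying electronic health records (EHRs) into diagnostic codes has been a challenge for the NLP community. State-of-the-art methods treat this as a multi-label classification problem and propose various architectures to model it. However, these systems do not leverage pretrained language models, which have achieved excellent performance on natural language understanding tasks. Prior work has shown that commonly used fine-tuning schemes underperform on this task. This paper therefore aims to analyze the causes of the underperformance and to develop a framework for automatic coding with pretrained language models. Through experiments, we identify three main issues: 1) a large label space, 2) long input sequences, and 3) a domain mismatch between pretraining and fine-tuning. We propose PLMICD, a framework that tackles these challenges with various strategies. Experimental results show that the proposed framework overcomes the challenges and achieves state-of-the-art performance on multiple metrics on the benchmark MIMIC data. The source code is available at https://github.com/miulab/plm-icd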
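Spoken language understanding (SLU) is an essential task for machines to understand human speech for better interaction. However, errors from automatic speech recognizers (ASR) often hurt understanding performance. In practice, ASR systems may not be easy to adjust for the target scenario. This paper therefore focuses on learning utterance representations that are robust to ASR errors using a contrastive objective, and further strengthens generalization by combining supervised contrastive learning with self-distillation during model fine-tuning. Experiments on three benchmark datasets demonstrate the effectiveness of the proposed approach.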
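Machine-generated citation sentences can aid automated scientific literature review and assist article writing. Current methods for generating citation text are limited to single-citation generation, using the citing document and one cited document as input. In real-world situations, however, writers often summarize several studies in one sentence or discuss related information across a whole paragraph. In addition, multiple citation intents have been identified in prior work, which implies that writers may need to control the intent of the generated sentences to cover different scenarios. This work therefore focuses on generating multiple citations and releases a newly collected dataset named CITEMI to drive future research. We first build a new generation model using the Fusion-in-Decoder approach to cope with multiple long inputs. Second, we incorporate predicted citation intents into training for intent control. Experiments show that the proposed approaches provide much more comprehensive features for generating citation sentences.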
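Dialogue relation extraction (DRE) aims to identify the relation between two entities in a given dialogue. During a conversation, speakers may reveal their relations to certain entities through explicit or implicit clues; such evidence is called "triggers". However, trigger annotations may not always be available for the target data, so leveraging this information to improve performance is challenging. This paper therefore proposes to learn how to identify triggers from data that has trigger annotations, and then to transfer the trigger-finding capability to other datasets to improve performance. Experiments show that the proposed approach improves relation extraction performance on unseen relations, and demonstrate the transferability of the proposed trigger-finding model across different domains and datasets.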
Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.
The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which holds back graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB is built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we use as user features the 20 user property features with the greatest information gain, together with user tweet features. In addition, we perform a thorough evaluation of MGTAB and other public datasets. Our experiments show that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and suggest potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
As one of the prevalent methods for building automation systems, Imitation Learning (IL) shows promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in the demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the environment returns obtained from conventional evaluation as coefficients to build an importance map. We also conduct experiments to investigate three major questions: whether frames are equally important, how effective the importance map is, and how importance maps from different IL models are related. The results show that R2RISE successfully distinguishes the important frames in the demonstrations.
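The masking-and-retraining loop described in the R2RISE abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `keep_prob` parameter, the per-frame averaging used for normalization, and the toy evaluation function are all assumptions made for illustration. In a real setting, `train_and_evaluate` would retrain the IL model on the masked demonstration and return the environment return of the resulting policy.

```python
import random
from typing import Callable, List, Sequence


def importance_map(
    num_frames: int,
    train_and_evaluate: Callable[[Sequence[int]], float],
    num_masks: int = 200,
    keep_prob: float = 0.5,
    seed: int = 0,
) -> List[float]:
    """RISE-style importance: average return over the masks that keep each frame."""
    rng = random.Random(seed)
    weighted = [0.0] * num_frames  # sum of returns over masks keeping frame i
    counts = [0] * num_frames      # number of masks keeping frame i
    for _ in range(num_masks):
        # Randomly keep (1) or drop (0) each demonstration frame.
        mask = [1 if rng.random() < keep_prob else 0 for _ in range(num_frames)]
        # Hypothetical black box: retrain on the masked demo, return the score.
        ret = train_and_evaluate(mask)
        for i, kept in enumerate(mask):
            if kept:
                weighted[i] += ret
                counts[i] += 1
    return [w / c if c else 0.0 for w, c in zip(weighted, counts)]


def toy_train_and_evaluate(mask: Sequence[int]) -> float:
    """Toy stand-in for retraining: frame 2 contributes most of the return."""
    frame_value = [0.1, 0.1, 5.0, 0.1, 0.1]
    return sum(v for v, kept in zip(frame_value, mask) if kept)


scores = importance_map(5, toy_train_and_evaluate)
best_frame = max(range(len(scores)), key=lambda i: scores[i])
```

In this toy setup the frame that drives the return receives the highest score, because every mask that keeps it yields a high return. Averaging over the masks that keep each frame is one simple normalization choice; other weightings are possible.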
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.
We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes the weighted average of mean and percentile performances, and it covers the distributionally robust MDPs and the distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. By considering that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive a tractable reformulation of our model. In particular, we show that the return-risk model can also account for risk from an uncertain transition kernel when one only seeks deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. A scalable first-order algorithm is designed to solve large-scale problems, and we demonstrate the advantages of our proposed model and algorithm through numerical experiments.